
# README for Replication files for: "Trade, Policy, and Development in the Digital Economy" 
Peter Herman and Sarah Oliver, https://doi.org/10.1016/j.jdeveco.2023.103135




# Code files and replication instructions
In most cases, the analysis can be run by placing all code and data files in a common directory and pointing the code files to that directory. In the Stata files, replace the current working directory with this directory (cd "directory path"). In the Python files, replace the location defined as "root_directory" at the beginning of each file.
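
As a minimal sketch of the Python setup, assuming the scripts read their inputs relative to `root_directory`, the following checks that the inputs listed in this README for the Nigeria simulation are present before running it. The helper `missing_inputs` and the placeholder path are hypothetical and not part of the replication files.

```python
from pathlib import Path

# Placeholder: replace with the directory holding all code and data files.
root_directory = Path("path/to/replication_files")

# Inputs listed in this README for simulation_Nigeria.py.
NIGERIA_INPUTS = [
    "all_xvalues_aggregate.csv",
    "all_countrypairFE_aggregate.csv",
    "beta_ests_aggregate.csv",
    "WB_internet.csv",
    "eora_trade_2016.csv",
    "country_year_coverage_v2_0.csv",
]


def missing_inputs(directory, filenames):
    """Return the names in `filenames` that are not files in `directory`."""
    directory = Path(directory)
    return [name for name in filenames if not (directory / name).is_file()]


missing = missing_inputs(root_directory, NIGERIA_INPUTS)
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files found.")
```

The same check can be applied to the other code files by swapping in their input lists.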

## 1. main_regression_tables.do
Stata do file used to estimate the regressions presented in tables 2 through 6 via ppmlhdfe. Also includes two regressions rerun using ppml_panel_sg in order to produce exporter-importer fixed effects for use in the counterfactual simulations.

**Data inputs**
* aggregate_trade_data.dta
* goods_services_trade_data.dta

**Outputs**
* 6 tables of regression results, saved as 6 .csv files
* 5 datasets containing additional regression results for use in the counterfactual simulations, saved as 5 .csv files
    
## 2. robustness_check_tables.do
Stata do file used to estimate the regressions presented in tables A1, A3, A5, A6 and A7, and summary statistics in tables A2 and A4. 

**Data inputs**
* aggregate_trade_data.dta
* goods_services_trade_data.dta 
* digital_trade_sector.dta
* trade_counts.dta 
* stri.dta
* kee_tri.dta
* IT_use_split.dta

**Outputs**
* 5 tables of regression results, saved as 7 .csv files (Flex models in A1 each saved in separate .csv files)



## 3. simulation_Nigeria.py
Python file used to run the first counterfactual simulation involving Nigeria's internet use.

**Data inputs**
* all_xvalues_aggregate.csv
* all_countrypairFE_aggregate.csv
* beta_ests_aggregate.csv
* WB_internet.csv
* eora_trade_2016.csv
* country_year_coverage_v2_0.csv

**Outputs**
* A collection of 3 tables containing different types of simulation results in different formats



## 4. simulation_India_Japan.py
Python file used to run the second counterfactual simulation involving digital provisions in the India-Japan FTA.

**Data inputs**
* all_xvalues_aggregate.csv
* all_countrypairFE_services.csv
* beta_ests_services.csv
* WB_internet.csv
* eora_trade_2016.csv
* country_year_coverage_v2_0.csv

**Outputs**
* A collection of 4 tables containing different types of simulation results in different formats


# Datasets

## aggregate_trade_data.dta
Trade data and covariates used for the main econometric gravity estimations. 

### List of variables
* **year**: Year of observation
* **region_o**: Region of exporter
* **region_d**: Region of importer
* **importer**: Importer (encoded by Stata)
* **exporter**: Exporter (encoded by Stata)
* **tech_neutrality**: 1 if exporter and importer had a preferential trade agreement with a provision on technological neutrality in that year, 0 otherwise.
* **ec_duty_prohibit**: 1 if exporter and importer had a preferential trade agreement with a provision prohibiting duties on electronic commerce in that year, 0 otherwise.
* **electronic_sig**: 1 if exporter and importer had a preferential trade agreement with a provision on electronic authentication in that year, 0 otherwise.
* **data_protection**: 1 if exporter and importer had a preferential trade agreement with a provision related to data protection in that year, 0 otherwise.
* **landlocked_d**: 1 if importer is landlocked, 0 otherwise
* **cybersecurity**: 1 if exporter and importer had a preferential trade agreement with a provision related to cybersecurity in that year, 0 otherwise.
* **goods_big_data**: 1 if exporter and importer had a preferential trade agreement with a provision related to big data in goods exports in that year, 0 otherwise.
* **free_data_flow**: 1 if exporter and importer had a preferential trade agreement with a provision promoting free data flows in that year, 0 otherwise.
* **agree_pta**: 1 if exporter and importer had a preferential trade agreement in that year, 0 otherwise (PTA_ijt)
* **distance**: Population weighted distance between exporter and importer
* **common_language**: 1 if exporter and importer share a common language, 0 otherwise
* **colony_of_origin_ever**: 1 if importer was ever a colony of exporter, 0 otherwise
* **colony_of_destination_ever**: 1 if exporter was ever a colony of importer, 0 otherwise
* **contiguity**: 1 if exporter and importer share a common border, 0 otherwise
* **internet_use_pair**: Internet use measure (IU_ijt)
* **minimum_bandwidth**: Bandwidth measure (BW_ijt)
* **undersea_c**: Undersea fiber optic cable count (FOS_ijt)
* **overland_c**: Overland fiber optic cable count (FOL_ij)
* **member_eu_joint**: 1 if both exporter and importer are members of the European Union, 0 otherwise (EU_ijt)
* **foreign**: 1 if trade flow is international (exporter != importer), 0 otherwise
* **foreign_XXXX**: 1 if trade flow is international and year == XXXX, 0 otherwise
* **trade**: Trade value from exporter to importer in year, in million $ (X_ijt)
* **imp_iso3**: Importer iso3 identifier as a string
* **exp_iso3**: Exporter iso3 identifier as a string
* **pair_sym**: Identifier for symmetric country pairs
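
To illustrate how the flow indicators in the list above relate to one another, the sketch below derives `foreign`, a `foreign_XXXX` year dummy, and a symmetric pair identifier from ISO3 codes. The function names and the `"AAA_BBB"` identifier format are assumptions made for illustration; the actual encoding of `pair_sym` in the dataset may differ.

```python
def foreign(exp_iso3, imp_iso3):
    """1 if the flow is international (exporter != importer), 0 otherwise."""
    return int(exp_iso3 != imp_iso3)


def foreign_year(exp_iso3, imp_iso3, year, target_year):
    """1 if the flow is international and year == target_year, 0 otherwise."""
    return int(foreign(exp_iso3, imp_iso3) == 1 and year == target_year)


def symmetric_pair_id(exp_iso3, imp_iso3):
    """Hypothetical direction-free identifier: same value for A->B and B->A."""
    first, second = sorted([exp_iso3, imp_iso3])
    return f"{first}_{second}"


# Domestic flows (exporter == importer) get 0 for every foreign indicator.
print(foreign("NGA", "NGA"), foreign("IND", "JPN"))
print(symmetric_pair_id("JPN", "IND") == symmetric_pair_id("IND", "JPN"))
```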


## goods_services_trade_data.dta
Trade data for the main econometric gravity estimations, separated into total goods and total services flows.

### Variables
Same variables as aggregate_trade_data.dta except for the inclusion of 1 additional variable:
* **goods_svcs**: 'goods' if flow is goods trade, 'services' if flow is services trade

## digital_trade_sector.dta

### Variables
Same variables as aggregate_trade_data.dta except for inclusion of:
* **sector**: Disaggregated sector classification (see table A4)

  
## IT_use_split.dta

### Variables
Same variables as aggregate_trade_data.dta except for inclusion of:
* **high_it**: Equals 1 if IT input share is above median share in a given year, 0 otherwise.

## kee_tri.dta

### Variables
* **imp_iso3**: Importer iso3 identifier as a string
* **OTRI**: Value of the overall trade restrictions index (tariffs plus non-tariff measures) compiled by Kee et al. (2009).
* **high_tariff_ntms**: Equals 1 if OTRI value is above the median value, 0 otherwise.
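
The above-median indicator can be sketched as follows. This is an illustration of the definition only, not the authors' code: the function name is hypothetical, the country codes and index values are made up, and it assumes the median is taken across all importers in the file with a strict "greater than" comparison.

```python
from statistics import median


def above_median_flags(values_by_country):
    """Map each country to 1 if its value is strictly above the median, else 0."""
    cutoff = median(values_by_country.values())
    return {iso3: int(value > cutoff) for iso3, value in values_by_country.items()}


# Made-up index values for three placeholder importers.
example = {"AAA": 0.1, "BBB": 0.2, "CCC": 0.3}
print(above_median_flags(example))
```

With this convention, a country sitting exactly at the median is coded 0.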


## stri.dta

### Variables
* **imp_iso3**: Importer iso3 identifier as a string
* **stri_score**: Value of the Services Trade Restrictions Index (STRI) compiled by the World Bank and WTO (2020).
* **high_stri**: Equals 1 if STRI value is above the median value, 0 otherwise.



## trade_counts.dta

### Variables
Same variables as aggregate_trade_data.dta except for inclusion of:
* **goods_svcs**: 'goods' if flow is goods trade, 'services' if flow is services trade
* **number_of_products**: Total number of products from exporter to importer in year
* **products_bytype**: Total number of goods or services products from exporter to importer in year 

And the removal of: **trade**


## all_countrypairFE_aggregate.csv
Estimated country-pair fixed effects using aggregate trade values. For use in Nigeria simulation.

### Variables
* **importer**: Country code for importer
* **exporter**: Country code for exporter
* **pairfe_all_provisions**: Estimated exporter-importer fixed effect values


## all_countrypairFE_services.csv
Estimated country-pair fixed effects using services trade values. For use in India-Japan simulation.

### Variables
* **importer**: Country code for importer
* **exporter**: Country code for exporter
* **pairfe_all_provisions**: Estimated exporter-importer fixed effect values


## all_xvalues_aggregate.csv
Collection of all regression covariates (x-values) used in the main regression specifications. For use in the two simulations.

### Variables
Variables included in aggregate_trade_data.dta as well as several additional variables:
* **bandwidth_index**: Bandwidth index value
* **high_int_ex**: Interaction between provision_index and an indicator that takes a value of 1 if the exporter is high income and zero otherwise.
* **low_int_ex**: Interaction between provision_index and an indicator that takes a value of 1 if the exporter is low income and zero otherwise.
* **provision_index**: Provision index ([0,1] index based on number of provisions in a trade agreement)
* **high_digital_ex**: Interaction between the provision index ([0,1] index based on number of provisions in a trade agreement) and an indicator that takes the value of 1 if the exporter is high income
* **low_digital_ex**: Interaction between the provision index ([0,1] index based on number of provisions in a trade agreement) and an indicator that takes the value of 1 if the exporter is low income
* **high_any_dig_ex**: Interaction between the provision indicator, which takes the value of one if there is at least 1 digital provision in an active trade agreement, and an indicator that takes the value of 1 if the exporter is high income
* **low_any_dig_ex**: Interaction between the provision indicator, which takes the value of one if there is at least 1 digital provision in an active trade agreement, and an indicator that takes the value of 1 if the exporter is low income
* **constant**: A constant equal to 1
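
As a small illustration of the interaction terms described above, the sketch below multiplies a provision index by exporter income-group indicators. The function and argument names are hypothetical, and the two indicators are passed separately because this README does not state whether the low-income indicator is the exact complement of the high-income one.

```python
def provision_interactions(provision_index, exporter_high_income, exporter_low_income):
    """Interact a [0,1] provision index with 0/1 exporter income indicators."""
    return {
        "high_digital_ex": provision_index * exporter_high_income,
        "low_digital_ex": provision_index * exporter_low_income,
    }


# Made-up example: index of 0.5 for a high-income exporter.
print(provision_interactions(0.5, 1, 0))
```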


## beta_ests_aggregate.csv
Beta coefficient and standard error estimates from the preferred econometric specification for aggregate trade. For use in the Nigeria simulation. 
### Variables
* **Column 1 (unlabeled)**: Variable name
* **Column 2 (unlabeled)**: Alternating rows of coefficient estimates (row with variable label) and standard errors (following row without label)
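
The alternating-row layout can be read into a lookup of coefficients and standard errors along the following lines. This is a sketch under stated assumptions rather than the parsing used in the simulation files: the function names are hypothetical, the sample values and variable names are made up, and it assumes the second column holds plain numbers, optionally wrapped in parentheses or followed by significance stars.

```python
import csv
import io


def _to_float(cell):
    """Convert a cell to float after stripping parentheses and stars (assumed)."""
    return float(cell.strip().strip("()*"))


def read_beta_estimates(lines):
    """Pair each labeled coefficient row with the unlabeled row that follows."""
    estimates = {}
    current = None
    for row in csv.reader(lines):
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or incomplete rows
        name, value = row[0].strip(), row[1]
        if name:
            # Labeled row: coefficient estimate for this variable.
            current = name
            estimates[current] = {"coef": _to_float(value), "se": None}
        elif current is not None:
            # Unlabeled row: standard error of the preceding coefficient.
            estimates[current]["se"] = _to_float(value)
            current = None
    return estimates


# Made-up sample in the layout described above.
sample = io.StringIO("var_a,0.25\n,(0.10)\nvar_b,-1.5\n,(0.40)\n")
estimates = read_beta_estimates(sample)
print(estimates["var_a"])
```

To read a file instead, pass an open file handle, for example `open(path, newline="")`, in place of the `StringIO` object. The same layout is described for beta_ests_services.csv.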


## beta_ests_services.csv
Beta coefficient and standard error estimates from the preferred econometric specification for services trade. For use in the India-Japan simulation.
### Variables
* **Column 1 (unlabeled)**: Variable name
* **Column 2 (unlabeled)**: Alternating rows of coefficient estimates (row with variable label) and standard errors (following row without label)


## country_year_coverage_v2_0.csv
List of ISO3 country codes and corresponding country names
### Variables
* **iso3**: ISO3 country code
* **country**: Country name
* **earliest_year**: First year the country appears in the Dynamic Gravity Dataset (DGD)
* **latest_year**: Latest year the country appears in the DGD


## eora_trade_2016.csv
Bilateral foreign and domestic trade flows in 2016 derived from the Eora input-output database.
### Variables
* **exporter**: Exporter ISO3 country code
* **importer**: Importer ISO3 country code
* **eora_trade_total_$M**: Total aggregate trade from exporter to importer in $ millions
* **eora_trade_goods_$M**: Total goods trade from exporter to importer in $ millions
* **eora_trade_services_$M**: Total services trade from exporter to importer in $ millions


## WB_internet.csv
Data on internet connectivity from the World Bank Databank.
### Variables
* **Time**: Year
* **Time Code**: Alternative year identifier (YR[year])
* **Country Name**: Full country name
* **Country Code**: Country ISO3 code
* **Individuals using the Internet (% of population) [IT.NET.USER.ZS]**: Percent of population using the internet
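
Because the indicator column has a long name, a lookup by country and year is easiest with the header stored in a constant. The sketch below uses the column names listed above; the function name, the sample rows, and the treatment of non-numeric cells as missing (for example a `..` placeholder) are assumptions for illustration.

```python
import csv
import io

INTERNET_COL = "Individuals using the Internet (% of population) [IT.NET.USER.ZS]"


def internet_use(rows, iso3, year):
    """Return the internet-use percentage for a country-year, or None if absent."""
    for row in rows:
        if row["Country Code"] == iso3 and row["Time"] == str(year):
            try:
                return float(row[INTERNET_COL])
            except ValueError:
                return None  # non-numeric cell treated as missing (assumed)
    return None


# Made-up rows with placeholder countries, in the column layout described above.
sample = io.StringIO(
    "Time,Time Code,Country Name,Country Code," + INTERNET_COL + "\n"
    "2016,YR2016,Exampleland,AAA,42.5\n"
    "2016,YR2016,Otherland,BBB,..\n"
)
rows = list(csv.DictReader(sample))
print(internet_use(rows, "AAA", 2016))
```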


